The present disclosure is generally related to devices, systems, methods, and computer programs/algorithms that may be used to process images, documents, and/or texts using, for example, optical character recognition (OCR) and compare the documents to find differences between them.
Currently, in document flow, records management, and many other aspects of conducting business, one frequently encounters the task of comparing two or more documents that contain text or other information, either to determine whether they are identical or to find the differences between them. One particular case is comparing a copy of a document with its initial version, for example, to rule out the possibility that the document or template was mistakenly or intentionally modified while being completed.
For example, when a contract is entered into after going through a multitude of coordination stages, the following situation is possible. One of the parties to the agreement, referred to here as party A, sends a version of the contract to the other party, referred to here as party B, for subsequent signature. After B has signed the contract, A may wish to ensure that the signed contract corresponds to the initial contract (the original) and does not contain changes, unforeseen corrections, or the like. If the entire signing procedure is carried out digitally using digital signatures, the comparison task is simplified. However, agreements and other legal documents are frequently signed on paper, after which party A receives either a paper copy or a scanned (photographed or faxed) copy bearing the signature.
The task of checking whether documents are identical becomes more cumbersome if a paper version of the document is in the document flow. Currently, this type of problem is addressed by comparing the electronic version and the paper version of the document by hand. A person (operator) must study the two versions carefully and meticulously in order to conclude either that the versions coincide or that they have significant differences. The process becomes noticeably more complicated if the contract runs to dozens or hundreds of pages.
As a rule, to compare printed and electronic versions of documents, the documents are converted to text, and it is the resulting text files that are compared. The results of this text comparison are then shown to the user. However, text comparison is not always sufficient. In particular, it is insufficient if it is necessary to find discrepancies not merely in the text but also in layout, coordinates, presence of tables, printing, signatures, stamps, or other items. In addition, for each change found in the text, the user needs to look through the two original documents to locate the corresponding places, and then decide whether the change found is a significant discrepancy or not.
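By way of illustration only, the conventional text-only comparison described above can be sketched as follows. The sketch assumes that both documents have already been converted to plain text (for example, by OCR); the function name compare_texts and the sample strings are hypothetical and are not part of the disclosure. It uses the unified_diff function from the Python standard library difflib module and, consistent with the limitation noted above, reports only textual differences, not differences in layout, signatures, or stamps.

```python
import difflib


def compare_texts(original: str, copy: str) -> list[str]:
    """Return unified-diff lines describing how `copy` differs from `original`.

    Both arguments are assumed to be plain text already extracted from the
    documents. An empty list means the two texts are identical line by line.
    """
    diff = difflib.unified_diff(
        original.splitlines(),
        copy.splitlines(),
        fromfile="original",
        tofile="signed_copy",
        lineterm="",  # inputs carry no trailing newlines, so add none to headers
    )
    return list(diff)


if __name__ == "__main__":
    # Hypothetical sample texts standing in for the recognized documents.
    original_text = "Payment is due within 30 days.\nDelivery is by courier."
    signed_text = "Payment is due within 90 days.\nDelivery is by courier."
    for line in compare_texts(original_text, signed_text):
        print(line)
```

Such output lists changed lines of text but gives no indication of where each change appears on the page, which is why the user must still locate each change in the source documents.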
A method is needed that analyzes documents to identify differences between the documents and presents the results of comparison of the documents in a simple and easily comprehended manner to the user.